Churn prediction in telecom using Random Forest and PSO based data balancing in combination with various feature selection strategies

نویسندگان

  • Adnan Idris
  • Muhammad Rizwan
  • Asifullah Khan
چکیده

The telecommunication industry faces fierce competition to retain customers, and therefore requires an efficient churn prediction model to monitor the customer’s churn. Enormous size, high dimensionality and imbalanced nature of telecommunication datasets are main hurdles in attaining the desired performance for churn prediction. In this study, we investigate the significance of a Particle Swarm Optimization (PSO) based undersampling method to handle the imbalance data distribution in collaboration with different feature reduction techniques such as Principle Component Analysis (PCA), Fisher’s ratio, F-score and Minimum Redundancy and Maximum Relevance (mRMR). Whereas Random Forest (RF) and K Nearest Neighbour (KNN) classifiers are employed to evaluate the performance on optimally sampled and reduced features dataset. Performance is evaluated using sensitivity, specificity and Area under the curve (AUC) based measures. Finally, it is observed through simulations that our proposed approach based on PSO, mRMR, and RF termed as Chr-PmRF, performs quite well for predicting churners and therefore can be beneficial for highly competitive telecommunication industry.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...

متن کامل

Application of Feature Extraction Method in Customer Churn Prediction Based on Random Forest and Transduction

With the development of telecom business, customer churn prediction becomes more and more important. An outstanding issue in customer churn prediction is high dimensional problem. Curse of dimensionality will easily occur if effective feature extraction is not applied during modeling. Among the most popular feature extraction approaches, principal component analysis (PCA) method based on induct...

متن کامل

Predicting credit card customer churn in banks using data mining

In this paper, we solve the customer credit card churn prediction via data mining. We developed an ensemble system incorporating majority voting and involving Multilayer Perceptron (MLP), Logistic Regression (LR), decision trees (J48), Random Forest (RF), Radial Basis Function (RBF) network and Support Vector Machine (SVM) as the constituents. The dataset was taken from the Business Intelligenc...

متن کامل

Dimensionality and data reduction in telecom churn prediction

Purpose – Churn prediction is a very important task for successful customer relationship management. In general, churn prediction can be achieved by many data mining techniques. However, during data mining, dimensionality reduction (or feature selection) and data reduction are the two important data preprocessing steps. In particular, the aims of feature selection and data reduction are to filt...

متن کامل

Neighborhood Cleaning Rules and Particle Swarm Optimization for Predicting Customer Churn Behavior in Telecom Industry

Churn prediction is an important task for Customer Relationship Management (CRM) in telecommunication companies. Accurate churn prediction helps CRM in planning effective strategies to retain their valuable customers. However, churn prediction is a complex and challenging task. In this paper, a hybrid churn prediction model is proposed based on combining two approaches; Neighborhood Cleaning Ru...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Computers & Electrical Engineering

دوره 38  شماره 

صفحات  -

تاریخ انتشار 2012